Assessment of Graphic Processing Units (GPUs) for Department of Defense (DoD) Digital Signal Processing (DSP) Applications
Authors: John D. Owens, Shubhabrata Sengupta, Daniel Horn
Abstract
In this report we analyze the performance of the fast Fourier transform (FFT) on graphics hardware (the GPU), comparing it to the best-of-class CPU implementation, FFTW. We describe the FFT, the architecture of the GPU, and how general-purpose computation is structured on the GPU. We then identify the factors that influence FFT performance and describe several experiments that compare these factors between the CPU and the GPU. We conclude that the overheads of transferring data and initiating GPU computation are substantially higher than on the CPU, and thus for latency-critical applications the CPU is a superior choice. We show that the CPU implementation is limited by computation and the GPU implementation by GPU memory bandwidth and its lack of a writable cache. The GPU is comparatively better suited for larger FFTs with many FFTs computed in parallel, in applications where FFT throughput is most important; on these applications GPU and CPU performance is roughly on par. We also demonstrate that adding further computation to an application that includes the FFT, particularly computation that is GPU-friendly, puts the GPU at an advantage compared to the CPU.
The future of desktop processing is parallel. The last few years have seen an explosion of single-chip commodity parallel architectures that promise to be the centerpieces of future computing platforms. Parallel hardware on the desktop (new multicore microprocessors, graphics processors, and stream processors) has the potential to greatly increase the computational power available to today’s computer users, with a resulting impact in computation domains such as multimedia, entertainment, signal and image processing, and scientific computation. Several vendors have recently addressed this need for parallel computing: Intel and AMD are producing multicore CPUs; the IBM Cell processor delivers impressive performance with its parallel cores; and several stream processor startups, including Stream Processors Inc. and Clearspeed, are producing commercial parallel stream processors. None of these chips, however, has achieved market penetration to the degree of the graphics processor (GPU). Today’s GPU features massive arithmetic capability and memory bandwidth with superior performance and price-performance when compared to the CPU. For instance, the NVIDIA ...
John D. Owens, Shubhabrata Sengupta, and Daniel Horn. “Assessment of Graphic Processing Units (GPUs) for Department of Defense (DoD) Digital Signal Processing (DSP) Applications”. Technical Report ECE-CE-, Computer Engineering Research Laboratory, University of California, Davis, . http://www.ece.ucdavis.edu/cerl/techreports/-/
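For readers unfamiliar with the transform the abstract refers to, the following is a minimal Python sketch of a recursive radix-2 FFT, checked against a direct DFT. It is an illustration only and is not the report's GPU implementation or FFTW; the function names `dft`, `fft`, and `batched_fft` are hypothetical and chosen for this example. The batched helper only mirrors the "many FFTs computed in parallel" workload shape by looping sequentially.

```python
import cmath


def dft(x):
    """Direct O(n^2) discrete Fourier transform, used here only as a reference."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft(x):
    """Recursive radix-2 decimation-in-time FFT (illustrative, not optimized).

    The input length must be a power of two.
    """
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd half, then one butterfly.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def batched_fft(batch):
    """Apply fft to each signal in a batch, one after another."""
    return [fft(signal) for signal in batch]
```

Each recursion level performs n/2 butterflies over log2(n) levels, which is where the O(n log n) cost comes from; the independent butterflies within a level, and the independent signals within a batch, are the parts a parallel processor can work on concurrently.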
Similar resources
Design and Implementation of Digital Demodulator for Frequency Modulated CW Radar (RESEARCH NOTE)
Radar Signal Processing has been an interesting area of research for realization of programmable digital signal processor using VLSI design techniques. Digital Signal Processing (DSP) algorithms have been an integral design methodology for implementation of high speed application specific real-time systems especially for high resolution radar. The CORDIC algorithm, in recent times, has turned out to...
Numerical Simulation of a Lead-Acid Battery Discharge Process using a Developed Framework on Graphic Processing Units
In the present work, a framework is developed for implementation of finite difference schemes on Graphic Processing Units (GPU). The framework is developed using the CUDA language and C++ template meta-programming techniques. The framework is also applicable for other numerical methods which can be represented similar to finite difference schemes such as finite volume methods on structured grid...
Using Graphics Processing Units in an LTE Base Station
Base stations have been built from ASICs, DSP processors, or FPGAs. This paper studies the feasibility of building wireless base stations from commercial graphics processing units (GPUs). GPUs are attractive because they are widely used massively parallel devices that can be programmed in a high level language. Base station workloads are highly parallel, making GPUs a potential candidate for a ...
Real-Time DSP-Based License Plate Character Segmentation Algorithm Using 2D Haar Wavelet Transform
The potential applications of Wavelet Transform (WT) are limitless including image processing, audio compression and communication systems. In image processing, WT is used in applications such as image compression, denoising, speckle removal, feature analysis, edge detection and object detection. The use of WT algorithms in image processing for real-time custom applications may require dedicate...
Investigating the Effects of Hardware Parameters on Power Consumptions in SPMV Algorithms on Graphics Processing Units (GPUs)
Although sparse matrix-vector multiplication (SPMV) algorithms are simple, they form important parts of linear algebra algorithms in mathematics and physics. As these algorithms can be run in parallel, Graphics Processing Units (GPUs) have been considered as one of the best candidates to run them. In recent years, power consumption has been considered as one of the metr...
Publication date: 2005